1.  INTRODUCTION

This document provides a detailed description of the lexicon candidate-selection
algorithms used in the Vicens-Reddy speech recognition system [1]. It is a
sequel to SDC TM-4652/200, Description and Analysis of the Vicens-Reddy
Preprocessing and Segmentation Algorithms [2], and SDC TM-4652/300, Description
and Analysis of the Vicens-Reddy Recognition Algorithms [3], to which the
reader is referred for a description of the terms and variables used.

The  lexicon  candidate-selection  process  assumes  an  existing  lexicon  as  described 
in SDC TM-4652/400, The Lexicon Design for the IBM 360/67 [4], and begins with
the  feature  matrix  output  of  the  recognition  process.  Briefly,  the  data 
manipulation  steps  leading  up  to  the  sample  feature  matrix  are: 

Step 1:  The raw speech data are digitized into 10-msec samples. Each time
         slice is characterized by six parameters (A1, Z1, A2, Z2, A3, Z3),
         which represent the amplitude and zero-crossing counts for
         each of three frequency bands: 150-900 Hz, 900-2200 Hz, and
         2200-5000 Hz. All parameters have been smoothed, and the amplitude
         parameters have been normalized with respect to A1 such that the A1
         range is 0-63 and the A2 and A3 ranges are 0-127. One unit of Z1,
         Z2, or Z3 measurement is equivalent to 50 Hz. The total speech
         sample, represented by n 10-msec segments each containing the A1,
         Z1, A2, Z2, A3, Z3 parameters plus a closeness computation, is
         named the Q-matrix.

Step 2:  The segmentation routine begins with the Q-matrix and produces a
         P-matrix of segments containing one or more Q-matrix segments
         combined on the basis of similarity and closeness.

Step 3:  The recognition routine begins with the P-matrix and may, on the
         basis of certain tests, recombine P-matrix segments. It assigns
         linguistic labels to the segments. Recognition then creates the
         feature matrix or R-matrix from the resulting P-matrix.

The R-matrix:

     r1,1     r1,2     r1,3    r1,4    r1,5    r1,6    r1,7    r1,8    r1,9    r1,10
     TYPR(2)  DURR(2)  A1R(2)  Z1R(2)  A2R(2)  Z2R(2)  A3R(2)  Z3R(2)  SXT(2)
       .        .        .       .       .       .       .       .       .
     TYPR(n)  DURR(n)  A1R(n)  Z1R(n)  A2R(n)  Z2R(n)  A3R(n)  Z3R(n)  SXT(n)

where the first row is reserved for special counts:

     r1,1   =  number of vowels in the message
     r1,2   =  number of fricatives in the message
     r1,3   =  vowel-fricative pattern in binary
     r1,4   =  last row number
     r1,5   =  row number of first vowel
     r1,6   =  row number of second vowel
     r1,7   =  row number of third vowel
     r1,8   =  row number of fourth vowel
     r1,9   =  row number of fifth vowel
     r1,10  =  unused position of the array


The  above  array  is  similar  but  not  identical  to  the  "Construction  of  the 
R-matrix"  on  pp.  25-26  of  [3].  The  R-matrix  described  above  is  the  one 
created  and  used  by  CWIPER  on  the  SDC  IBM  360/67.  The  R-matrix  described 
in  [3]  is  the  one  made  by  the  Vicens-Reddy  system  on  the  PDP10. 
This document describes a lexicon candidate-selection process which, given a
feature  matrix  of  the  utterance  representation  to  be  recognized,  selects  a 
list  of  possible  candidates  and  from  that  list  chooses  the  candidate  of  best 
match. 

2.  CANDIDATE-LIST  BUILDING  PROCESS 

Given  a  feature  matrix  representation  of  the  speech  samples  to  be  recognized, 
the  first  task  is  to  select  a  list  of  possible  candidates  from  the  total 
lexicon.  The  subset  selection  operates  serially  in  the  following  three 
stages: 

1.  Elimination  of  all  candidates  whose  relative  positions  of  vowels 
and  fricatives  are  different  from  those  of  the  sample  feature 
matrix. 

2.  Elimination of all candidates whose vowel zero-crossing
characteristics do not pass similarity tests described below.

3.  Elimination of all candidates having low vowel-similarity scores
obtained by comparison to those of the sample feature matrix.

2.1  SELECTION  OF  THE  INITIAL  LIST  OF  POSSIBLE  CANDIDATES 

The  vowel-fricative  hash*  of  the  sample  yields  a  pointer  to  a  unique  entry  in 
the  VFHASH  table.  This  entry  enables  us  to  begin  a  series  of  links  to  all 
lexicon  entries  having  the  same  number  of  vowels  and  fricatives  as  the  sample. 
Before  one  of  these  lexicon  entries  is  entered  in  the  initial  candidate  list, 
the  lexicon  entry’s  vowel-fricative  pattern  (i.e.,  the  relative  positions 
of  vowels  and  fricatives)  is  matched  against  that  of  the  sample.  If  they  match, 
the  lexicon  entry  is  entered  in  the  possible-candidate  list. 

* For the definition of the vowel-fricative hash, see [4].

If this initial selection of candidates fails (i.e., there are no candidates
in the subset), then it is highly likely that the segmentation or recognition
algorithms resulted in an erroneous or arbitrary linguistic labeling of one
or more segments. An error recovery process is initiated that attempts to
change the linguistic labels of segment(s) that might have been incorrectly
labeled and to select a list of different candidates. This is discussed in
more detail in Section 3.4.


2.2  ELIMINATION OF CANDIDATES WITH DIFFERENT VOWEL ZERO-CROSSING
     CHARACTERISTICS

In [3] we discussed the assignment of vowel types on the basis of
zero-crossing counts in the first and second frequency ranges. To review:


     Vowel type

     1  if  Z1 < 6       and  Z2 < 18
     2  if  Z1 < 6       and  18 ≤ Z2 < 27
     3  if  Z1 < 6       and  Z2 ≥ 27
     4  if  6 ≤ Z1 < 9   and  Z2 < 18
     5  if  6 ≤ Z1 < 9   and  18 ≤ Z2 < 27
     6  if  6 ≤ Z1 < 9   and  Z2 ≥ 27
     7  if  Z1 ≥ 9       and  Z2 < 18
     8  if  Z1 ≥ 9       and  18 ≤ Z2 < 27
     9  if  Z1 ≥ 9       and  Z2 ≥ 27
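
The nine-way classification can be sketched in a few lines of Python. This is an illustrative sketch only, not the original program; the function name vowel_type is hypothetical, and the inputs are assumed to be the smoothed Z1 and Z2 counts in 50-Hz units.

```python
def vowel_type(z1: int, z2: int) -> int:
    """Return the vowel type (1-9) for zero-crossing counts in 50-Hz units."""
    # Z1 band: below 6 units (300 Hz), 6 to 8 units, or 9 units (450 Hz) and up.
    row = 0 if z1 < 6 else (1 if z1 < 9 else 2)
    # Z2 band: below 18 units, 18 to 26 units, or 27 units and up.
    col = 0 if z2 < 18 else (1 if z2 < 27 else 2)
    # Types are numbered row by row, three Z2 bands per Z1 band.
    return 3 * row + col + 1
```

For instance, vowel_type(7, 20) falls in the middle Z1 band and the middle Z2 band and so yields type 5.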


Diagrammatically, the nine vowel categories appear as in Figure 1.

[Figure 1.  The Nine Vowel Categories; the Z1 boundaries fall at 6 units
(300 Hz) and 9 units (450 Hz).]



Vicens  [1]  has  devised  a  table  defining  rough  dissimilarity  between  pairs  of 
vowels  on  the  basis  of  their  vowel  type  classification. 


Table 1.  Vowel Dissimilarity Table
[Table body not recoverable from the source.]

For example, a vowel of vowel type 3 and a vowel of vowel type 5 have a crude
dissimilarity of 4. Each 0 entry in the table indicates a prohibited
correspondence, i.e., if a candidate vowel and a sample vowel are in
prohibited correspondence, the candidate is eliminated from the candidate list.
The procedure simply looks up in the table the dissimilarity values for all
the sample vowels and the vowels of the corresponding candidate. If a
prohibited correspondence is detected, the candidate is eliminated. Otherwise,
the dissimilarity values are added and form an overall dissimilarity value
characterizing the candidate. This process is repeated for all candidates
in the candidate list, and the list is reordered by increasing order of
dissimilarity values. At the end of this process, the lexicon numbers of
the candidates are in the STACK table, their corresponding record numbers
and beginning word within the record are in the STACK1 table, and the vowel
dissimilarity score is in the STACK2 table. Again, if there are no candidates
left, the error-recovery routine is initiated.
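
The lookup, elimination, and reordering steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the original routine: the function name rank_candidates and the argument layout are hypothetical, the table is assumed to be a dictionary keyed by (sample vowel type, candidate vowel type), and every table value in the usage example other than the (3, 5) entry of 4 quoted in the text is invented for the demonstration.

```python
def rank_candidates(sample_vowels, candidates, table):
    """Drop candidates with a prohibited vowel pairing and sort the rest.

    sample_vowels: vowel types of the sample, in order of occurrence.
    candidates:    list of (lexicon_number, vowel_types) pairs.
    table:         dict mapping (sample_type, candidate_type) to a
                   dissimilarity value; 0 marks a prohibited correspondence.
    Returns (lexicon_number, total_dissimilarity) pairs, smallest total first.
    """
    ranked = []
    for lexno, cand_vowels in candidates:
        total = 0
        for s, c in zip(sample_vowels, cand_vowels):
            d = table[(s, c)]
            if d == 0:          # prohibited correspondence: eliminate
                break
            total += d
        else:                   # no prohibited pair was met
            ranked.append((total, lexno))
    ranked.sort()               # increasing order of dissimilarity
    return [(lexno, total) for total, lexno in ranked]


# Usage with a hypothetical table fragment; only (3, 5) = 4 comes from the text.
demo_table = {(3, 5): 4, (3, 3): 1, (1, 1): 1, (3, 9): 0}
demo = rank_candidates(
    [3, 1],
    [(101, [5, 1]), (102, [3, 1]), (103, [9, 1])],
    demo_table,
)
```

In this sketch the returned pairs play the role of the STACK and STACK2 tables; an empty result would correspond to entering the error-recovery routine.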

2.3  ELIMINATION OF CANDIDATES HAVING LOW VOWEL-SIMILARITY SCORES WITH
     CORRESPONDING SAMPLE VOWELS

For  reasons  of  efficiency,  the  third  attempt  at  reducing  the  candidate  space 
is  implemented  as  part  of  the  segment-mapping  procedure.  It  is  described 
in  detail  in  Section  3.2.3.  Basically,  the  procedure  computes  a  similarity 
score  between  the  vowels  of  the  incoming  message  and  the  vowels  of  each 
candidate  in  the  list,  using  the  segment-similarity  evaluation  function 
described in Section 3.5. If this score is below a threshold, the entry is
eliminated  from  the  candidate  list;  if  it  is  above  the  threshold,  it  is 
retained.  Again,  if  there  are  no  candidates  left,  the  error-recovery  routine 
is  initiated. 

3.  CANDIDATE-SELECTION  PROCESS 

The  previous  sections  have  described  how  a  small  list  of  acceptable  candidates 
can  be  obtained  from  a  large  lexicon  by  various  heuristic  procedures.  This 
section discusses how a unique identification of the sample feature matrix
can be derived from this list of acceptable candidates by similarity
computation. The section is divided into five subsections:

3.1  Overall  description  of  the  candidate  selection  process 

3.2  Segment-mapping  procedure 

3.3  Final  selection 

3.4  Error  recovery  procedure 

3.5  Similarity  evaluation  procedure 

3.1  OVERALL  DESCRIPTION  OF  THE  CANDIDATE-SELECTION  PROCESS 

The procedure SERMAP searches the lexicon and computes a "similarity"
between the sample feature matrix and all the entries in the
acceptable-candidate list. This similarity computation is performed first by
calling the segment-mapping procedure, which creates linkages between the
segments of the two feature matrices to be matched, and then by averaging the
similarity values obtained for each pair of linked segments. The results of
this computation are stored for the EVALUT process, which chooses the
best-match candidate. If one of the candidates obtains a score greater than or
equal to 95 percent, the process immediately stops and returns the candidate
print-name (excellent-match-candidate heuristic). As the process continues,
modifications of the initial candidate list take place: each time a quite
good similarity score (≥ 80%) is obtained, the list is rearranged so as to
place next in order all remaining candidates having the same print-name.
We assume that if a candidate obtains a score of 80%, it is likely that
one of the candidates having the same print-name will obtain 95% or more.
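
The control flow of this search, with its two heuristics, can be sketched as follows. This is an illustrative sketch only: the function name sermap_loop, the (lexicon number, print-name) candidate layout, and the score_fn callback standing in for the mapping procedure are all assumptions, and a returned score of 0 is taken to mean that mapping rejected the candidate.

```python
def sermap_loop(candidates, score_fn):
    """Scan the candidate list, applying the 95% and 80% heuristics.

    candidates: list of (lexicon_number, print_name) in stack order.
    score_fn:   hypothetical stand-in for the mapping procedure; returns
                an overall similarity score in percent (0 = rejected).
    Returns (selected_print_name_or_None, accepted_candidates).
    """
    pending = list(candidates)
    accepted = []
    while pending:
        lexno, name = pending.pop(0)
        score = score_fn(lexno)
        if score <= 0:
            continue                      # candidate rejected by mapping
        accepted.append((lexno, name, score))
        if score >= 95:
            return name, accepted         # excellent-match-candidate heuristic
        if score >= 80:
            # Move remaining candidates with the same print-name to the front,
            # keeping the relative order within each group.
            same = [c for c in pending if c[1] == name]
            rest = [c for c in pending if c[1] != name]
            pending = same + rest
    return None, accepted                 # left for the selecting process


# Usage: candidate 1 scores 85, so candidate 3 (same name) is tried next.
demo_scores = {1: 85, 2: 99, 3: 96, 4: 50}
demo_name, demo_accepted = sermap_loop(
    [(1, "ONE"), (2, "TWO"), (3, "ONE"), (4, "THREE")],
    lambda lexno: demo_scores[lexno],
)
```

A return value of None for the name corresponds to the case in which no candidate reaches 95 percent and the stored scores are handed to EVALUT.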

In  normal  message  identification,  LISCAN  and  SERMAP  are  called  each  time  a 
new  feature  matrix  of  the  utterance  is  built.  The  initial  feature  matrix  is, 
of  course,  the  feature  matrix  determined  through  the  segmentation  process. 
However,  since  the  error-recovery  procedure  changes  the  original  feature 
matrix  of  the  utterance,  new  calls  to  LISCAN  and  SERMAP  might  be  necessary 
to  investigate  other  parts  of  the  lexicon. 

The selecting process, which is utilized when no candidate with a very high
similarity  score  is  found,  is  a  simple  algorithm  acting  on  the  scores  stored 
by  the  matching  process.  Each  candidate  left  in  the  candidate  list  at  this 
point  is  characterized  by  three  scores: 

1.  Similarity  score  for  vowels 

2.  Similarity  score  for  non-vowels 

3.  Overall  similarity  score 

On  the  basis  of  these  three  numerical  values,  EVALUT  chooses  the  best-match 
candidate.  The  first  decision  is  made  on  the  basis  of  the  overall  similarity 
scores.  If  the  overall  scores  of  several  candidates  are  close,  a  second 
decision  is  made  based  on  the  vowel  scores.  If  several  candidates  are  still 
left after this second stage, the algorithm considers the non-vowel scores.*
If  more  than  one  candidate  is  still  present  in  the  set  of  possible  responses, 
the  candidate  with  the  best  overall  score  is  finally  chosen.  Of  course, 
the  process  terminates  at  any  stage  when  the  number  of  considered  candidates 
reduces  to  1.  The  print-name  of  the  chosen  candidate  is  returned  as  the 
recognition  response  if  it  satisfies  the  acceptability  criterion. 
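
The staged decision can be sketched as follows. This is a sketch under stated assumptions rather than the EVALUT code itself: the text does not say how "close" is measured, so the tolerance tol (5 points here) is purely hypothetical, as are the function name evalut_sketch and the dictionary layout of a candidate. Each stage filters the survivors of the previous stage.

```python
def evalut_sketch(cands, tol=5):
    """Pick the best-match candidate from a non-empty list of score records.

    cands: list of dicts with keys "name", "overall", "vowel", "nonvowel".
    tol:   hypothetical closeness tolerance, in score points.
    """
    def close_to_best(group, key):
        best = max(c[key] for c in group)
        return [c for c in group if best - c[key] <= tol]

    group = cands
    # Stage 1: overall score; stage 2: vowel score; stage 3: non-vowel score.
    for key in ("overall", "vowel", "nonvowel"):
        group = close_to_best(group, key)
        if len(group) == 1:
            return group[0]               # stop as soon as one candidate is left
    # Several candidates survive all three stages: take the best overall score.
    return max(group, key=lambda c: c["overall"])


demo_cands = [
    {"name": "A", "overall": 82, "vowel": 70, "nonvowel": 90},
    {"name": "B", "overall": 80, "vowel": 85, "nonvowel": 60},
    {"name": "C", "overall": 60, "vowel": 99, "nonvowel": 99},
]
demo_choice = evalut_sketch(demo_cands)
```

In the example, C is dropped on the overall score, and B is then preferred to A on the vowel score even though A has the higher overall score.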

3.2  SEGMENT-MAPPING  PROCEDURE 

The mapping procedure is lengthy and complicated and contains many heuristics.
We describe it in detail in order to get a firm grasp on the heuristics
involved; this should enable us to make intelligent modifications. Several
programming errors have been found and will be noted.


* A program error was detected before the final selection on the basis of the
best non-vowel score. The counter K, which was used to count the candidates
in the first subset, was not reinitialized before the non-vowel selection,
and the input to the final selection could contain candidates previously
dropped on the basis of a too-low vowel score.

3.2.1  EXPAND Subroutine

Mapping begins by calling the EXPAND subroutine. EXPAND stores the 10-column
R-matrix in a 12-column R1-matrix. This is done so that the original R-matrix
can be used to check all remaining potential candidates in the candidate list.
The two additional columns are needed to store pointers and weights. The
11th column is labeled BACPT1 and is later renamed POINT1. The 12th column
is labeled WT1. EXPAND also takes the current candidate feature matrix from
the lexicon and stores this in the so-called R2-matrix. R1 and R2 are described
graphically in Tables 2 and 3.

3.2.2  First-Step  Vowel  and  Fricative  Mapping 

MAXLNG is set for later use. It is the larger of 24 and the longest segment
duration in the R1 and R2 matrices, plus an additional 8. (We are not aware
of the reason for the 24 or the 8.) All A1 parameters having a zero value in
the R1 and R2 matrices are set to 1. The BACPT1, WT1, BACPT2, and WT2 columns
are initialized to zero.

We now want to synchronize the comparisons of rows of the R1 matrix with the
rows of the R2 matrix. To do this, two columns, MAPAR1 and MAPAR2, are defined
and used as follows:

     MAPAR1(i) = j and MAPAR2(i) = k

means that row j of the R1 matrix is mapped to row k of the R2 matrix.

MAPAR1 and MAPAR2 are initialized as follows:

     MAPAR1(1)  = 1
     MAPAR1(2)  = 1 + (last row number of R1)
     MAPAR1(3)  = 0
       .
       .
       .
     MAPAR1(60) = 0
     MAPAR2(1)  = 1

The  R1  Matrix  is  the  feature  matrix  of  the  sample: 


4  October  1971 


10 


System  Development  Corporation 
TM-4652/500/00 



     MAPAR2(2)  = 1 + (last row number of R2)
     MAPAR2(3)  = 0
       .
       .
       .
     MAPAR2(60) = 0

The R1 matrix and then the R2 matrix are searched for vowels. Each time a
vowel is found, its row number is entered in the next available location in
the respective MAPAR1 (for R1 vowels) or MAPAR2 (for R2 vowels) table. If
the numbers of vowel entries in MAPAR1 and MAPAR2 are different, the message

     MAPPING ERROR NUMBER 1 CANDIDATE LEX NO XXX

appears on the user's terminal and this candidate is removed from the list.
The same process as described above for vowels is then applied to fricative
segments. If the numbers of fricatives in MAPAR1 and MAPAR2 are different,
the message

     MAPPING ERROR NUMBER 2 CANDIDATE LEX NO XXX

appears on the user's terminal and this candidate is removed from the list.

The  sort  is  then  called  with  the  parameter  MAPARS.  MAPARS  is  the  count  of 
the  entries  in  the  MAPAR1  table  or  the  MAPAR2  table  (i.e.,  each  table  has 
the  same  number  of  entries).  The  sort  does  the  following: 

1.  Sorts MAPAR1 entries into ascending order,
    i.e., MAPAR1(1) = 1
          MAPAR1(2),...,MAPAR1(MAPARS-1) = row numbers of vowel
              and fricative rows in ascending order
          MAPAR1(MAPARS) = (last row number of R1) + 1

2.  Each time a position change is made for a MAPAR1 entry, makes a
    corresponding position change for the corresponding entry in
    MAPAR2. MAPAR2 is ordered by position and not by the intrinsic
    value of its entries.

3.  Sets the BACPT1 column of the R1 matrix: for i = 1,...,MAPARS,
    we set BACPT1(MAPAR1(i)) = i

    Sets the BACPT2 column of the R2 matrix: for i = 1,...,MAPARS,
    we set BACPT2(MAPAR2(i)) = i

4.  Last, tests the BACPT2 column for entries in ascending order. If
    the test fails, the message

         MAPPING ERROR NUMBER 3 CANDIDATE LEXNO XXX

    appears on the user's terminal and this candidate is removed
    from the list.

The sort completes the first crude mapping of vowels and fricatives of the
sample with the candidate. For any segment j of the R1 matrix, if BACPT1(j)
≠ 0, then MAPAR2(BACPT1(j)) = segment number of the R2 matrix mapped to
segment j. Likewise, for any segment k of the R2 matrix, if BACPT2(k) ≠ 0,
then MAPAR1(BACPT2(k)) = segment number of the R1 matrix mapped to segment k.
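
The first-step mapping described in this section can be sketched as follows. This is an illustration under stated assumptions, not the original FORTRAN: the function name first_step_mapping and the labels "VOWEL" and "FRIC" are hypothetical, segment labels are passed as plain lists for rows 2 through the last row, the back-pointer columns are represented as dictionaries (an absent key plays the role of a zero entry), and the ascending test is applied to the reordered MAPAR2 table, which is equivalent to testing the BACPT2 column.

```python
def first_step_mapping(types1, types2):
    """Crude vowel/fricative mapping between sample (R1) and candidate (R2).

    types1, types2: segment labels for rows 2..last of R1 and R2
                    (row 1 holds the special counts).
    Returns (mapar1, mapar2, bacpt1, bacpt2); the MAPAR lists are 0-based
    Python lists holding the memo's 1-based row numbers.
    """
    def rows_of(types, label):
        return [row for row, t in enumerate(types, start=2) if t == label]

    last1, last2 = len(types1) + 1, len(types2) + 1
    # Entries 1 and 2 of each table: row 1 and (last row number) + 1.
    mapar1, mapar2 = [1, last1 + 1], [1, last2 + 1]
    for number, label in ((1, "VOWEL"), (2, "FRIC")):
        r1, r2 = rows_of(types1, label), rows_of(types2, label)
        if len(r1) != len(r2):
            raise ValueError(f"MAPPING ERROR NUMBER {number}")
        mapar1 += r1
        mapar2 += r2

    # Sort by R1 row number, carrying each MAPAR2 entry along with its partner.
    pairs = sorted(zip(mapar1, mapar2))
    mapar1 = [a for a, _ in pairs]
    mapar2 = [b for _, b in pairs]

    # The carried MAPAR2 entries must themselves be in ascending order.
    if any(x >= y for x, y in zip(mapar2, mapar2[1:])):
        raise ValueError("MAPPING ERROR NUMBER 3")

    # Back-pointers: row number -> 1-based position in the MAPAR table.
    bacpt1 = {row: i for i, row in enumerate(mapar1, start=1)}
    bacpt2 = {row: i for i, row in enumerate(mapar2, start=1)}
    return mapar1, mapar2, bacpt1, bacpt2


m1, m2, b1, b2 = first_step_mapping(
    ["STOP", "VOWEL", "FRIC", "VOWEL"],
    ["VOWEL", "NASAL", "FRIC", "TRANS", "VOWEL"],
)
```

With these inputs, row 3 of R1 has BACPT1 = 2, and entry 2 of MAPAR2 is row 2 of R2, so the first sample vowel is linked to the first candidate vowel through the table rather than directly.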

3.2.3  Correcting  the  Vowel  Mapping 

This routine accomplishes two purposes. First, if a vowel is preceded or
followed by a high consonant sound like /r/ or /l/, this consonant may be
incorrectly classified VOWEL, thus creating a mislinkage at this early
stage of the mapping. The same mislinkage may occur if the vowel is a
diphthong, in which case either part of it can be classified VOWEL. To
correct this defect, this procedure redefines the links on the basis of
similarity of parameters between the linked vowel segments and the consonant
or nasal segments adjacent to them. The similarity of parameters between
segments is defined by the similarity function, which is described in
Section 3.5. Second, the mapping procedure continues by computing a
similarity score between all the correctly mapped vowels. If the obtained
score is below a heuristic threshold (i.e., (number of vowels) × (−6) >
obtained similarity score), the candidate is eliminated from the candidate
list. This is the third candidate-space-reduction procedure mentioned in
Section 2.3.

3.2.4  Map  Everything 

The  program  now  proceeds  by  mapping  the  segments  between  any  two  pairs  of 
mapped  segments  on  the  basis  of  parameter  similarity.  Additional  weighting 
is given to the similarity parameter for special conditions (i.e., when both
segments  are  vowel  or  both  segments  are  stop  or  both  have  the  same  displacement 
from  the  beginning  or  end  of  the  closest  mapped  segment).  This  process  is 
recursively  repeated  until  the  program  cannot  effect  any  more  mapping. 

3.2.5  Combining  in  Mapping 

As a result of mapping everything, the few remaining unmapped segments are
then candidates for combination with their preceding or following segments.
Since these second-order combinations, in general, degrade both representations
to be matched, care must be exercised in applying them. The closeness index
between segments is computed using the PROXB function, which is the same as
the PROXTM function defined in the segmentation procedure [2] except that
it uses the R1 or R2 matrix rather than the P-matrix. On the basis of the
closeness values between the unmapped segment and its adjacent segments, the
closer adjacent segment is chosen. If the closeness value between the
unmapped segment and the chosen segment is high enough, a combination occurs.
The combinations are done one at a time and in parallel on both representations.
Each time a combination occurs, the mapping process is re-entered in an attempt
to map the segment resulting from the combining. This mapping-combining process
is recursively repeated until no more combining or mapping can be performed.*


* The combining is a complicated process and is accomplished by MAP's calling
subroutine COMBN and COMBN's calling the COMPRX and MX subroutines. An
error was detected in the COMPRX subroutine in that BACPT1 and BACPT2 were
used as if they pointed to the mapped segment in the other matrix directly,
instead of indirectly through the MAPAR1 and MAPAR2 tables.

3.2.6  Update Pointers in R1 and R2 Matrices

The pointers in the R1 and R2 matrices are updated as follows:
for i = 2,...,MAPARS-1, we set

     POINT1(MAPAR1(i)) = MAPAR2(i)

     POINT2(MAPAR2(i)) = MAPAR1(i)

It is important to note that this is not just an updating, as [1] implies,
but a redefinition of the pointers. Note that POINT1 is the BACPT1 column
of R1 and POINT2 is the BACPT2 column of R2. Previously, BACPT1(j) pointed
to an entry in the MAPAR2 table that contained the segment number of the R2
matrix that was mapped to the j-th segment in the R1 matrix. Now POINT1(j) =
the segment number of the R2 matrix that is mapped to the j-th segment of the
R1 matrix. Similarly, POINT2(k) = the segment number of the R1 matrix that
is mapped to the k-th segment of the R2 matrix.
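
The redefinition can be sketched as follows. This is a minimal sketch, assuming the MAPAR tables are held as Python lists of row numbers in mapped order; the function name redefine_pointers is hypothetical, and dictionaries stand in for the POINT1 and POINT2 columns.

```python
def redefine_pointers(mapar1, mapar2):
    """Turn the table-position back-pointers into direct row-to-row pointers.

    mapar1, mapar2: parallel lists of mapped row numbers, whose first and
                    last entries are the two boundary entries.
    Returns (point1, point2): R1 row -> R2 row and R2 row -> R1 row.
    """
    point1, point2 = {}, {}
    # i = 2,...,MAPARS-1 in the memo's 1-based indexing skips both boundaries.
    for r1_row, r2_row in zip(mapar1[1:-1], mapar2[1:-1]):
        point1[r1_row] = r2_row
        point2[r2_row] = r1_row
    return point1, point2


demo_point1, demo_point2 = redefine_pointers([1, 3, 4, 5, 6], [1, 2, 4, 6, 7])
```

After this step a lookup such as demo_point1[5] yields the mapped R2 row (6) in one step, whereas before it the same answer required going through the MAPAR2 table.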

3.2.7  Last  Correction  Vowel  Checking 

A last attempt to correct the vowel mapping is now made. This correction
is based on a comparison between the weighted amplitudes and durations of
the mapped vowel segments and the weighted amplitudes and durations of the
mapped segments preceding and following the vowel segments if (1) such
segments exist and (2) they are transitionals, consonants, or nasals. Given
a segment i in R1 and a segment j in R2, the weighting heuristic is as
follows:*

     8*A11(i) + 4*A21(i) + 2*A31(i) + DUR1(i) + 8*A12(j) + 4*A22(j) + 2*A32(j) + DUR2(j)

A special weight of 60 is added to the weight for a mapped vowel to a mapped
vowel.

* This weighting of amplitudes is explained in [2].

3.2.8  The Similarity Evaluation Between R1 and R2

The similarity evaluation procedure computes the following three scores
between R1 and R2: the vowel similarity score, the non-vowel similarity
score, and the overall similarity score. The vowel similarity score is a
sum of the similarity scores of all of the mapped vowels. It is computed
by calling the EVAL subroutine, which returns with a similarity score
between two mapped segments. A proportion of this score is added to the
vowel similarity score. The proportion is determined from the weights of the
mapped segments. The weights represent the ratio of the vowel segment
duration to the total vowel duration for both R1 and R2. The non-vowel
similarity score is computed in a like fashion except that (1) the non-vowel
weights represent a proportion of the non-vowel duration to the total
non-vowel duration for R1 and R2 and (2) the non-vowel similarity score is
corrected by the sum of the weights of the unmapped segments. This
correction is necessary only for non-vowels because the occurrence of
unmapped vowels would have resulted in an unacceptable candidate before
mapping proceeded very far. The overall similarity score is computed from
(1) the vowel similarity score weighted by the ratio of the sum of the vowel
durations to the total duration for both R1 and R2 and (2) the non-vowel
similarity score weighted by the ratio of the sum of the non-vowel durations
to the total duration for both R1 and R2. A correction factor representing
the proportion of difference in duration (between R1 and R2) to the total
duration of R1 and R2 is subtracted from the result. A detailed description
of the steps in this evaluation process is given below.


The procedure begins with an examination of the first segments of R1 and R2.
If they are both stops, their respective durations are set to (duration+1)/2.
The last segments of R1 and R2 are handled similarly. R1 and R2 are then
examined for unmapped segments. If an unmapped transitional segment is
found, its duration is set to zero; if an unmapped burst segment is found,
its duration is set equal to (duration+1)/[illegible].

The vowel duration sum, VSUM, is set equal to the sum of all the vowel
durations of R1 and R2. The non-vowel duration sum, RSUM, is set equal to the
sum of all the non-vowel durations of R1 and R2. The total duration sum, ASUM,
is the sum of all the segment durations of R1 and R2. The vowel weight,
VWT, is set equal to (VSUM*100)/ASUM. The non-vowel weight, RWT, is set equal
to 100-VWT.


The weight of each segment in the R1 and R2 matrices is set as follows:

     For i = 2,...,ROWCT1,
        if TYPE1(i) = vowel, then WT1(i) = (DUR1(i)*1000)/VSUM
        or if TYPE1(i) ≠ vowel, then WT1(i) = (DUR1(i)*1000)/RSUM

     and for i = 2,...,ROWCT2,
        if TYPE2(i) = vowel, then WT2(i) = (DUR2(i)*1000)/VSUM
        or if TYPE2(i) ≠ vowel, then WT2(i) = (DUR2(i)*1000)/RSUM


The vowel similarity score, ANSWE1, and the non-vowel similarity score,
ANSWE2, are initialized to zero. The next step is to proceed through the
R1 matrix and compute the similarity between each R1 mapped segment and its
R2 segment using the subroutine EVAL (see Section 3.5.2).

     For i = 2,...,ROWCT1
        If POINT1(i) ≠ 0 (i.e., POINT1(i) is the R2 segment),
           set J1 = POINT1(i), then call EVAL(i,J1)
        Set POINT1(i) = score (i.e., similarity score between i and J1 from
           EVAL) and set SCORE = SCORE * (WT1(i) + WT2(J1))
        Then if TYPE1(i) = vowel, set ANSWE1 = ANSWE1 + SCORE
        or if TYPE1(i) ≠ vowel, set ANSWE2 = ANSWE2 + SCORE
        then set POINT2(J1) =

The variable CONSUM is set equal to the sum of the weights of the unmapped
segments in R1 and R2.

Then:

     CONSUM = max (0, CONSUM-60)

     ANSWE1 = ANSWE1 / 1000

     ANSWE2 = (CONSUM*100 - ANSWE2) / (-1000)

     ANSWE3 = SUM(i=2,...,ROWCT1) DUR1(i)  -  SUM(j=2,...,ROWCT2) DUR2(j)

then

     ANSWE3 = max (|ANSWE3|*100/ASUM - 10, 0)
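
The score accumulation of this section can be sketched as follows. This is an illustrative sketch under several assumptions, not the original subroutine: the function name similarity_scores, the dictionary arguments, and the eval_fn callback standing in for EVAL are hypothetical; the weight scale of 1000 and the signs are taken from the formulas as they can be read in this section; floating-point arithmetic is used instead of integer arithmetic; and the preliminary duration adjustments for stops, transitionals, and bursts are omitted.

```python
def similarity_scores(dur1, dur2, vowels1, vowels2, point1, eval_fn):
    """Compute (ANSWE1, ANSWE2, ANSWE3) for one sample/candidate pair.

    dur1, dur2:       row number -> segment duration for R1 and R2.
    vowels1, vowels2: sets of row numbers labeled vowel in R1 and R2.
    point1:           mapped R1 row -> R2 row (unmapped rows are absent).
    eval_fn(i, j):    hypothetical stand-in for EVAL, returning 0-100.
    """
    vsum = (sum(d for r, d in dur1.items() if r in vowels1)
            + sum(d for r, d in dur2.items() if r in vowels2))
    asum = sum(dur1.values()) + sum(dur2.values())
    rsum = asum - vsum

    def weight(duration, is_vowel):
        total = vsum if is_vowel else rsum
        return duration * 1000 / total if total else 0.0

    wt1 = {r: weight(d, r in vowels1) for r, d in dur1.items()}
    wt2 = {r: weight(d, r in vowels2) for r, d in dur2.items()}

    answe1 = answe2 = 0.0
    for i, j in point1.items():
        score = eval_fn(i, j) * (wt1[i] + wt2[j])
        if i in vowels1:
            answe1 += score
        else:
            answe2 += score

    # Penalty from unmapped segments in both matrices, with a 60-unit allowance.
    mapped2 = set(point1.values())
    consum = sum(w for r, w in wt1.items() if r not in point1)
    consum += sum(w for r, w in wt2.items() if r not in mapped2)
    consum = max(0.0, consum - 60)

    answe1 = answe1 / 1000
    answe2 = (answe2 - consum * 100) / 1000
    diff = sum(dur1.values()) - sum(dur2.values())
    answe3 = max(abs(diff) * 100 / asum - 10, 0.0)
    return answe1, answe2, answe3


# Two segments per matrix, all mapped, every pair scoring 80.
demo_scores3 = similarity_scores(
    {2: 10, 3: 10}, {2: 10, 3: 10}, {2}, {2}, {2: 2, 3: 3}, lambda i, j: 80
)
```

Because the vowel weights of R1 and R2 together sum to 1000, dividing by 1000 turns the accumulated vowel score into a duration-weighted average, which is why a uniform pair score of 80 comes back as 80.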


3.2.9  Final Evaluation

Set LIM2 = 35 and LIM1 = 40. However, if the sum of vowels and fricatives
in the sample is 1, then LIM1 = LIM1 + VWT/[illegible].

Then if:

     ANSWE1 > LIM1 and ANSWE2 > LIM2 and ANSWE3 ≤ 25, or ANSWE1 > 80 and
     either ANSWE2 ≤ LIM2 or ANSWE3 > 25 and the sum of the vowels and
     fricatives is 1, then REPLY = (ANSWE1*VWT + ANSWE2*RWT)/100 - ANSWE3.

Otherwise the candidate is not acceptable, REPLY = 0, and we return to SERMAP.

If REPLY < 50 and the sum of the fricatives and vowels in the sample is
greater than 1, the candidate is not acceptable, REPLY = 0, and we return to
SERMAP.




In order to continue the mapping process, either REPLY must be ≥ 50, or REPLY
< 50 and the sum of vowels and fricatives must be equal to one.

If there are already 39 acceptable candidates (i.e., INDEX = 39) in the STAMA
table, EVALUT is called to select the best candidate. The acceptable candidate
count (INDEX) is stepped by one and the current candidate information (i.e.,
40 characters of the print name, ANSWE1, ANSWE2, ANSWE3, REPLY) is
stored in the INDEX row of the STAMA table (see Table 4). The mapping of
this candidate is completed and successful, and MAP returns to SERMAP.

3.3  FINAL  SELECTION 

Mapping returns to SERMAP with REPLY set to the overall similarity score.
If REPLY > 0, the lexicon number, record number, and beginning word
number of the sample are stored in the INDEX row of STAMA. If REPLY ≥ 95,
the current candidate is the selected candidate and a message is printed on
the user's terminal:

"YOU  SAID  (PRINT  NAME  OF  CANDIDATE)"

"LEXNO SESSNO SAMPLE MANNO SCORE__"

"PREPR SEGMNT RECOGN MAPPI "

"CANDIDATES  IN  LIST  WITH  SAME  NAME__"

CWIPER then asks if the sample is to be inserted in the lexicon. If the answer
is yes, the sample is inserted. The program then returns to the main driver,
where a new sample may be analyzed or learned or the program stopped.

If 80 ≤ REPLY < 95, the remaining candidates in the stack (if there are more
than one) are rearranged so as to place next in order all remaining candidates
having the same print name as the current candidate. The mapping routine is
again entered to consider the next candidate in the stack.

If the stack is exhausted before a candidate with an overall similarity score
≥ 95 is found, then EVALUT is called to select the best candidate from the
STAMA table and enter it in row 1 of the STAMA table. A test (given below)
is then made to determine whether this selected candidate is similar enough
to be chosen.


Given data arrays:

                  DATA ARRAYS

     ENTRY NUMBER      ANS1      ANS4

          1             72        80
          2             68        80
          3             63        80
          4             58        80


Let N1 = MIN(4,VOWCT1). Then if the selected candidate's overall similarity
score ≥ ANS4(N1), the vowel similarity score ≥ ANS1(N1), and the non-vowel
similarity score ≥ ANS1(N1), the candidate is the selected candidate and the
process discussed in Section 3.2.9 occurs. If the test fails, we proceed
to the error-recovery routine discussed in the following section.
Keep in mind, however, that the current candidate information is in the
first row of the STAMA table and (if no other adequate candidate is found)
may be the final choice.
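
The threshold test above can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are hypothetical, the comparisons are read as "greater than or equal", and a vowel count of at least 1 is assumed.

```python
# Threshold arrays from the table above, keyed by entry number 1..4.
ANS1 = {1: 72, 2: 68, 3: 63, 4: 58}
ANS4 = {1: 80, 2: 80, 3: 80, 4: 80}


def passes_initial_test(overall, vowel, non_vowel, vowel_count):
    """Return True if the best candidate is similar enough to be chosen.

    Assumes vowel_count >= 1 and that every comparison is ">=".
    """
    n1 = min(4, vowel_count)  # N1 = MIN(4, VOWCT1)
    return (
        overall >= ANS4[n1]
        and vowel >= ANS1[n1]
        and non_vowel >= ANS1[n1]
    )


if __name__ == "__main__":
    print(passes_initial_test(85, 70, 70, 2))
    print(passes_initial_test(85, 70, 60, 2))
```

Because ANS1 decreases with the entry number, samples with more vowels are held to a lower vowel and non-vowel threshold, while the overall threshold stays at 80.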


3.4  ERROR  RECOVERY  ROUTINE 

The  error  recovery  routine  may  be  entered  whenever  the  candidate  list  has 
been  exhausted  before  an  acceptable  candidate  is  found.  It  is  based  on  the 
assumption  that  a  segmenting  or  labeling  error  in  the  sample  will  lead  to 
an  erroneous  set  of  lexicon  candidates.  The  sample  is  therefore  examined 
for  borderline  cases  of  vowels  (a  nasal,  consonant  or  transitional  might 
have been classified as a vowel or a weak vowel might have been classified
as a transitional, consonant or nasal) and borderline cases of fricatives
(an  unvoiced  fricative  might  have  been  classified  as  a  burst  or  a  voiced 
fricative  might  have  been  classified  as  a  fricative) .  A  feasibility  value 
is  assigned  to  the  borderline  case  and  the  borderline  case  list  is  arranged 
in  decreasing  order  of  feasibility,  so  that  the  entries  most  likely  to  be 
incorrectly  classified  appear  first  in  the  list. 

Vicens ran the error recovery routine until 15 seconds had elapsed,
or until the first three borderline cases and their combinations had been
run, or until a candidate with an overall similarity score ≥ 95 was found.
(We discovered that our version of the program did not cycle through
the three borderline cases and their combinations properly, and we
reprogrammed this section of it.)

Assuming  that  three  borderline  segments  1,2,3  have  been  detected  in  the 
sample,  the  following  actions  are  performed: 

Modify  1  and  run  candidate-list  building  and  selection 
Modify  2  and  run  candidate-list  building  and  selection 
Modify  3  and  run  candidate-list  building  and  selection 
Modify  1  and  2  and  run  candidate-list  building  and  selection 
Modify  1  and  3  and  run  candidate-list  building  and  selection 
Modify  2  and  3  and  run  candidate-list  building  and  selection 
Modify  1,  2  and  3  and  run  candidate-list  building  and  selection 
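
The order of the modifications listed above (single segments first, then pairs, then all three) can be generated mechanically. The following Python sketch is illustrative only; the function name and the `max_cases` parameter are hypothetical and are not taken from the program.

```python
from itertools import combinations


def modification_schedule(borderline_segments, max_cases=3):
    """List the segment subsets to modify, smallest subsets first.

    Only the first `max_cases` borderline segments are considered,
    mirroring the limit of three borderline cases described in the text.
    """
    cases = list(borderline_segments)[:max_cases]
    schedule = []
    for size in range(1, len(cases) + 1):
        schedule.extend(combinations(cases, size))
    return schedule


if __name__ == "__main__":
    for subset in modification_schedule([1, 2, 3]):
        print("Modify", subset, "and run candidate-list building and selection")
```

For three borderline segments this yields the seven runs shown above, in the same order.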

The actual heuristics used in the program to determine borderline cases were
more complicated than those described in [1] and are discussed in Section 3.4.1.
When the selected changes and their combinations have been run, a best
candidate may have been found (either as a result of the error recovery or
from the initial selection) whose overall similarity score < 95.
Information about this candidate would be in row 1 of the STAMA table. If a
candidate had been found with an overall similarity score ≥ 95, the
error-recovery procedure would have stopped and that candidate would have been
selected.

(If  no  acceptable  candidate  had  been  found  after  running  the  error-recovery 
routine,  the  user  would  receive  the  terminal  message  "No  acceptable  candidate, 
even  after  modifying  the  sample.")  If  there  is  a  best  candidate  whose  overall 
similarity  score  <  95,  the  candidate's  scores  are  now  put  through  a  series 
of  acceptance  tests  before  the  candidate  is  actually  accepted. 

Given the following data arrays:

Entry                     DATA ARRAY NAME
Number    ANS1  ANS2  ANS3  ANS4  ANS5  ANS6  ANS7  ANS8
  1        72    65   145    80    60    90   128    60
  2        68    60   140    80    58    86   125    60
  3        63    55   135    80    56    82   123    60
  4        58    50   130    80    54    78   120    60

Let N1 = MIN(4,VOWCT). N1 is the entry number for the data arrays described
above. In the following tests all references to scores are to those
belonging to the best candidate, which are located in row 1 of the STAMA table.

The  tests  are  performed  in  the  following  order: 

1.  If the overall similarity score ≥ ANS1(N1) and the vowel similarity
score > ANS2(N1), the candidate is accepted.

2.  If the overall similarity score > ANS2(N1) and the overall
similarity score + the vowel similarity score > ANS3(N1), the
candidate is accepted.

3.  If the overall similarity score ≥ ANS2(N1), and if either the
overall similarity score + the vowel similarity score ≥ ANS7(N1)
or the non-vowel similarity score > ANS4(N1), and if the vowel
similarity score + the non-vowel similarity score + 10 ≥ ANS7(N1),
and if the non-vowel similarity score ≥ ANS8(N1), the candidate
is accepted.

4.  If the overall similarity score ≥ ANS5(N1) and if the overall
similarity score ≥ ANS6(N1), the candidate is accepted.

Test 4 above is superfluous: examining the data arrays shows that
for the overall similarity score to be ≥ ANS6(i) (which is
the ultimate test) it would have to be ≥ ANS2(i) (for i = 1, 2, 3 or 4), and
the candidate would therefore never have reached test 4. Perhaps the data
arrays were originally set to different values and changed heuristically
without revising the program.

3.4.1  Discussion  of  Changed  Heuristics 

Pages 115-116 of [1] discuss the heuristics found useful in defining the
borderline cases in error recovery. These differ from the heuristics
found in the program in the treatment of the duration parameter. The
feasibility value of a transitional, nasal or consonant becoming a
vowel is given by:

Reference [1]:     A1 + A2 + A3 + DURATION/20 - 90
Computer program:  90 - 5 * DURATION - A1 - A2 - A3

Furthermore, the nasal, consonant or transitional to be changed is
selected by bounding consecutive segments of nasals and/or consonants
by a stop, vowel, fricative, or burst at either end. One segment within
these boundaries is selected for possible change in each case. If one
of the boundary segments is a vowel, then for each of the segments, including
the boundary segments, a sum formed from DURATION + A1 + A2 + A3 is computed
and saved. If a nasal or consonant is a mild local maximum, then its sum will
be greater than that of the segments on either side of it, and it will be the
selected segment and be given a feasibility number. If the bounding segments
are both non-vowels, then each segment within the boundary is given a
weight = 3 * DURATION + A1 + A2, and the segment with the greatest weight is
selected and given a feasibility value.

If a nonstressed vowel is short or has a low amplitude (defined by the
conditions A1 ≤ 45 or 5 * DURATION + A1 < 75), it is a candidate for becoming
a consonant. Its feasibility value is:

Reference [1]:     90 - A1 - A2 - A3 - DURATION/20
Computer program:  5 * DURATION + A1 + A2 + A3 - 90

If a fricative segment is short or does not have excellent unvoiced fricative
characteristics (defined by the conditions DURATION ≤ 8; or 5 * DURATION
+ Z3 ≤ 110; or Z3 < 70 and A1 + A2/2 ≤ 5; or Z3 < 70 and either the segment
is not the last or next-to-last segment or the DURATION > 12), it is a
candidate for becoming a burst. Its feasibility value is:

Reference [1]:     (90 - DURATION/20 - Z3)/3 + (A3 - A1)/2
Computer program:  (5 * DURATION + Z3 - 90)/3 + (A3 - A1)/2

A burst segment with a duration greater than 40 ms is a candidate for becoming
a fricative. Its feasibility value is:

Reference [1]:     (DURATION/20 + Z3 - 90)/3 + (A1 - A3)/2
Computer program:  (90 - 5 * DURATION - Z3)/3 + (A1 - A3)/2

If we consider that the durations in the feasibility values in [1] might be
given in milliseconds, while those in the program are given in units of
10 milliseconds, the differences in the treatment of duration are partially
explained, but the formulas remain basically different.
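
To make the comparison concrete, the two feasibility formulas for a transitional, nasal or consonant becoming a vowel can be evaluated side by side. This Python sketch is illustrative only: the function names are hypothetical, the program's duration coefficient is read as 5, and the unit interpretation (milliseconds in [1], 10-msec units in the program) is the assumption discussed above, not a confirmed fact.

```python
def to_vowel_feasibility_reference(a1, a2, a3, duration_ms):
    """Formula as given in [1]; duration assumed to be in milliseconds."""
    return a1 + a2 + a3 + duration_ms / 20 - 90


def to_vowel_feasibility_program(a1, a2, a3, duration_units):
    """Formula as found in the program; duration assumed in 10-msec units."""
    return 90 - 5 * duration_units - a1 - a2 - a3


if __name__ == "__main__":
    # A hypothetical 80-msec segment (8 units) with modest amplitudes.
    a1, a2, a3 = 20, 10, 5
    print(to_vowel_feasibility_reference(a1, a2, a3, duration_ms=80))
    print(to_vowel_feasibility_program(a1, a2, a3, duration_units=8))
```

Under the unit assumption, DURATION/20 in [1] corresponds to half a point per 10-msec unit, against five points per unit in the program, and the two expressions also have opposite signs. The change of units therefore accounts for only part of the difference.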

3.5  SIMILARITY  EVALUATION  PROCEDURE 

The similarity evaluation subroutine, EVAL, uses subroutine GRAD to do the
actual closeness calculation. In fact, EVAL proceeds mainly by setting
up calls to GRAD. Because the subroutines GRAD and EVAL differ from
Vicens's ALGOL program listing [1], they are described in detail here.

3.5.1  Subroutine GRAD

GRAD is called primarily from EVAL but is also called at various times by
subroutine MAP to compute a closeness calculation. The input parameters
to GRAD, set by the calling program, are:

MTEMP - corresponds to INFLIM given in [1]

MDIV  - not in the ALGOL listing in [1]; appears to have been added later

XMUL  - corresponds to weight given in [1]

RMAX  - corresponds to RATIOLIM given in [1]

II1   - the segment 1 parameter value

II2   - the segment 2 parameter value

The purpose of GRAD is to compute the closeness between II1 and II2 given
MTEMP, MDIV, XMUL, and RMAX. The FORTRAN subroutine GRAD is given in
Figure 2.

      SUBROUTINE GRAD
      IMPLICIT INTEGER (A-Z)
      REAL XMUL, RMAX, RATIO
C     VALUES BELOW THE LOWER LIMIT MTEMP ARE RAISED BY MTEMP
      IF (II1.LT.MTEMP) II1 = II1 + MTEMP
      IF (II2.LT.MTEMP) II2 = II2 + MTEMP
      DIFF = IABS(II1 - II2)
C     SMALL DIFFERENCES RECEIVE THE MAXIMUM SCORE
      GSUM = 100
      IF (MAX0((II1+II2)/MDIV,MTEMP).GE.DIFF) RETURN
C     RATIOS ABOVE RMAX RECEIVE THE MINIMUM SCORE
      GSUM = -200
      RATIO = (FLOAT(DIFF)*XMUL)/SQRT(FLOAT(II1+II2))
      IF (RATIO.GT.RMAX) RETURN
      GSUM = IFIX(1.0-RATIO)*110.0
C     CLAMP THE SCORE TO THE RANGE -200 TO 100
      IF (GSUM.GT.100) GSUM = 100
      IF (GSUM.LT.-200) GSUM = -200
      RETURN
      END


Figure  2.  GRAD  FORTRAN  Subroutine 
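
For readers who want to experiment with the closeness calculation, the following Python function is a rough port of Figure 2. It is a sketch under stated assumptions, not the original code: the shared FORTRAN variables are passed as explicit arguments, inputs are assumed non-negative with MTEMP at least 1, the scaling constant is read as 110.0, and the scaling line truncates before multiplying, as it appears in the listing.

```python
import math


def grad(ii1, ii2, mtemp, mdiv, xmul, rmax):
    """Closeness score between two parameter values, from -200 to 100."""
    # Values below the lower limit MTEMP are raised by MTEMP.
    if ii1 < mtemp:
        ii1 += mtemp
    if ii2 < mtemp:
        ii2 += mtemp
    diff = abs(ii1 - ii2)

    # Small differences receive the maximum score.
    if max((ii1 + ii2) // mdiv, mtemp) >= diff:
        return 100

    # Ratios above RMAX receive the minimum score.
    ratio = (diff * xmul) / math.sqrt(ii1 + ii2)
    if ratio > rmax:
        return -200

    # int() truncates toward zero, like IFIX (assumed reading of the listing).
    gsum = int(int(1.0 - ratio) * 110.0)
    return max(-200, min(100, gsum))


if __name__ == "__main__":
    print(grad(10, 10, mtemp=2, mdiv=10, xmul=0.625, rmax=1.5))
    print(grad(10, 40, mtemp=2, mdiv=10, xmul=0.625, rmax=1.5))
```

Returning the score instead of setting a shared GSUM variable is a design choice made here so that the function can be tested in isolation.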

3.5.2  Subroutine  EVAL 

EVAL computes a similarity evaluation score between a segment in the R1
matrix and a segment in the R2 matrix. This score is adaptive with respect
to the type of segments to be matched. It is composed of the following
sections, each of which evaluates the type combinations appearing opposite
the section name. The type combination consists of a pair such as stop-stop,
stop-burst, etc., where the first entry is the R2 type and the second entry
is the R1 type.


Section   Type Combinations

EV100     stop-stop, stop-burst, burst-stop
EV200     consonant-stop, nasal-stop, stop-consonant, stop-nasal
EV300     consonant-vowel, nasal-vowel, vowel-consonant,
          vowel-nasal, vowel-vowel
EV400     stop-fricative, burst-burst, fricative-burst
EV500     consonant-burst, nasal-burst, burst-consonant,
          burst-nasal, burst-vowel, vowel-burst
EV600     consonant-consonant
EV700     consonant-fricative, nasal-fricative, stop-vowel,
          fricative-consonant, fricative-nasal, fricative-vowel,
          vowel-stop, vowel-fricative
EV800     burst-fricative, fricative-burst, fricative-fricative
EV900     nasal-nasal
EV1000    consonant-nasal
EV1100    nasal-consonant


EVAL is called with two input parameters, SEGR1 and SEGR2. SEGR1 is a
segment number in the R1 matrix and SEGR2 is a segment number in the R2
matrix.

EVAL begins by setting:

J1 = SEGR1
J2 = SEGR2
I1 = TYPE1(J1)
I2 = TYPE2(J2)

If I1 is a vowel, set I1 = 6
If I2 is a vowel, set I2 = 6
If I1 is a transitional, set I1 = 1 (consonant)
If I2 is a transitional, set I2 = 1 (consonant)

and computing ETEMP = (I2*6) + I1 - 6 to find the proper place to evaluate J1,
J2 on the basis of types I1, I2.
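
The ETEMP computation is a row-major index into a 6-by-6 table of type pairs. The short Python sketch below illustrates this, assuming the type codes run from 1 to 6; only the codes 1 (consonant) and 6 (vowel) are stated in the text, and the function name is hypothetical.

```python
def etemp(i1, i2):
    """Dispatch index for a type pair, as ETEMP = (I2*6) + I1 - 6."""
    return i2 * 6 + i1 - 6


if __name__ == "__main__":
    # Every ordered pair of codes 1..6 maps to a distinct index 1..36.
    indices = sorted(etemp(i1, i2) for i1 in range(1, 7) for i2 in range(1, 7))
    print(indices == list(range(1, 37)))
    print(etemp(1, 1), etemp(6, 6))
```

Each of the 36 indices can then select one of the evaluation sections (EV100 through EV1100) listed in the table of type combinations.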

EV100: STOP-STOP, STOP-BURST, BURST-STOP

Step 1  Set:  II1 = DUR1(J1)
              II2 = DUR2(J2)
              RMAX = 1.5
              XMUL = .625
              MDIV = 10
              MTEMP = 2
              CALL GRAD
              SCORE = GSUM

Step 2  Set:  II1 = A11(J1)
              II2 = A12(J2)
              RMAX = 4.0
              XMUL = .578125
              CALL GRAD
              SCORE = SCORE + GSUM

Step 3  Set:  II1 = A21(J1)
              II2 = A22(J2)
              MTEMP = 4
              CALL GRAD
              XTEMP = 3

              If A31(J1) + A32(J2) < 6
              then: SCORE = (SCORE + GSUM)/XTEMP and RETURN
              else: SCORE = SCORE + GSUM

Step 4  Set:  II1 = Z31(J1)
              II2 = Z32(J2)
              CALL GRAD
              XTEMP = XTEMP + 1
              SCORE = (SCORE + GSUM)/XTEMP
              then RETURN
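
The EV100 steps can be sketched as a Python function. This is an illustration under several assumptions: segments are represented as dictionaries with hypothetical keys (DUR, A1, A2, A3, Z3), the closeness routine is passed in as a `grad` argument, parameters not reset in a step are assumed to keep their previous values, and the final division truncates as FORTRAN integer division would.

```python
def ev100(seg1, seg2, grad):
    """Score a stop/burst pair; seg1 is from R1 and seg2 from R2."""
    # Step 1: duration.
    score = grad(seg1["DUR"], seg2["DUR"], mtemp=2, mdiv=10, xmul=0.625, rmax=1.5)
    # Step 2: band-1 amplitude (MDIV and MTEMP carried over from Step 1).
    score += grad(seg1["A1"], seg2["A1"], mtemp=2, mdiv=10, xmul=0.578125, rmax=4.0)
    # Step 3: band-2 amplitude.
    gsum = grad(seg1["A2"], seg2["A2"], mtemp=4, mdiv=10, xmul=0.578125, rmax=4.0)
    xtemp = 3
    if seg1["A3"] + seg2["A3"] < 6:
        # Too little band-3 energy: average the three scores and stop.
        return int((score + gsum) / xtemp)
    score += gsum
    # Step 4: band-3 zero crossings.
    gsum = grad(seg1["Z3"], seg2["Z3"], mtemp=4, mdiv=10, xmul=0.578125, rmax=4.0)
    xtemp += 1
    return int((score + gsum) / xtemp)


def stub_grad(ii1, ii2, **params):
    """Stand-in closeness routine used only for this demonstration."""
    return 100 if ii1 == ii2 else -200


if __name__ == "__main__":
    a = {"DUR": 5, "A1": 20, "A2": 10, "A3": 8, "Z3": 40}
    print(ev100(a, dict(a), stub_grad))
```

The score is an average of three or four closeness values, so it stays on the same scale as a single GRAD result.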


EV200: CONSONANT-STOP, NASAL-STOP, STOP-CONSONANT, STOP-NASAL

        Set:  II1 = A11(J1)
              II2 = A12(J2)
              MTEMP = 2
              MDIV = 10
              RMAX = 2.0
              XMUL = .625
              CALL GRAD
              SCORE = GSUM

              If SCORE < 0 then return
              else go to EV100


EV300: CONSONANT-VOWEL, NASAL-VOWEL, VOWEL-CONSONANT, VOWEL-NASAL,
       VOWEL-VOWEL

              RMAX1 = RMAX2 = 0
              XMUL1 = XMUL2 = .1
              DIV1 = 4
              DIV2 = 2

EV350

              If |A11(J1) + A21(J1) - A12(J2) - A22(J2)| < 12
              then: MTEMP = A11(J1) + A21(J1) + A12(J2) + A22(J2)
              else: MTEMP = [illegible](A11(J1) + A21(J1), A12(J2) + A22(J2))

              RMAX1 = RMAX1 + (280.0 - FLOAT(2*MTEMP))/100.0
              RMAX2 = RMAX2 + (300.0 - FLOAT(MTEMP))/80.0

              DIV1 = DIV1 + (80 + MTEMP)/13
              DIV2 = DIV2 + (100 + MTEMP)/13

              XMUL1 = XMUL1 + (FLOAT(MTEMP) + 40.0)/140.0
              XMUL2 = XMUL2 + (FLOAT(MTEMP-2) + 40.0)/240.0

Step 1  Set:  II1 = DUR1(J1)
              II2 = DUR2(J2)
              RMAX = 1.5
              XMUL = .625
              MDIV = 10
              MTEMP = 2
              Call GRAD
              SCORE = GSUM

Step 2  Set:  II1 = Z11(J1)
              II2 = Z12(J2)
              RMAX = RMAX1
              XMUL = XMUL1
              MDIV = DIV1
              MTEMP = 1
              Call GRAD
              SCORE = SCORE + (3*GSUM)

Step 3  Set:  II1 = A11(J1)
              II2 = A12(J2)
              RMAX = RMAX2
              XMUL = XMUL2
              MTEMP = 2
              Call GRAD
              SCORE = SCORE + GSUM/[illegible]

Step 4  Set:  II1 = A21(J1)*32/A11(J1)
              II2 = A22(J2)*32/A12(J2)
              Call GRAD
              SCORE = SCORE + GSUM

Step 5  Set:  II1 = A31(J1)*32/A11(J1)
              II2 = A32(J2)*32/A12(J2)
              MTEMP = 4
              Call GRAD
              SCORE = SCORE [illegible] GSUM
              XTEMP = 6

Step 6  If A21(J1) < 8 and A22(J2) < 8 go to Step 7
              II1 = Z21(J1)
              II2 = Z22(J2)
              RMAX = RMAX1 + .1
              XMUL = XMUL1 - .1
              MDIV = DIV1 - 2
              MTEMP = 2
              Call GRAD
              SCORE = SCORE + 2*GSUM
              XTEMP = XTEMP + 2

Step 7  If A31(J1) < 8 and A32(J2) < 8 go to Step 8
              II1 = Z31(J1)
              II2 = Z32(J2)
              RMAX = RMAX1 + .2
              XMUL = XMUL1 - .2
              MDIV = DIV1 - 3
              MTEMP = 4
              Call GRAD
              SCORE = SCORE + GSUM
              XTEMP = XTEMP + 1

Step 8        SCORE = SCORE/XTEMP


              then return

EV400: STOP-FRICATIVE, BURST-BURST, FRICATIVE-BURST

Step 1  Set:  II1 = DUR1(J1)
              II2 = DUR2(J2)
              RMAX = 1.5
              XMUL = .625
              MDIV = 10
              MTEMP = 2
              Call GRAD

(Note: In the original computer program, the results of Step 1 of EV400
are not saved.)

Step 2  Set:  II1 = Z11(J1)
              II2 = Z12(J2)
              Call GRAD
              SCORE = Max (25, GSUM)

Step 3  Set:  II1 = A11(J1)
              II2 = A12(J2)
              Call GRAD
              SCORE = SCORE + GSUM

Step 4  Set:  II1 = Z31(J1)
              II2 = Z32(J2)
              MTEMP = 4
              Call GRAD
              SCORE = SCORE + GSUM

Step 5  Set:  II1 = A31(J1)
              II2 = A32(J2)
              MTEMP = 2
              Call GRAD
              SCORE = Max (40, (SCORE + GSUM)/[illegible])
              Then return


EV500: CONSONANT-BURST, NASAL-BURST, BURST-CONSONANT, BURST-NASAL, BURST-VOWEL,
       VOWEL-BURST

Step 1  Set:  II1 = A11(J1)
              II2 = A12(J2)
              RMAX = 1.5
              XMUL = .650
              MDIV = 10
              MTEMP = 2
              Call GRAD
              SCORE = GSUM

              If SCORE < 0 then return
              else: If Min (A11(J1), A12(J2)) < 4 go to EV400
                    else go to EV300


EV600: CONSONANT-CONSONANT

              RMAX1 = .3
              RMAX2 = .5
              XMUL1 = 0
              XMUL2 = 0
              DIV1 = 0
              DIV2 = 0


              Go to EV350

EV700: CONSONANT-FRICATIVE, NASAL-FRICATIVE, STOP-VOWEL, FRICATIVE-CONSONANT,
       FRICATIVE-NASAL, FRICATIVE-VOWEL, VOWEL-STOP, VOWEL-FRICATIVE

              SCORE = -1000
              RETURN

EV800: BURST-FRICATIVE, FRICATIVE-BURST, FRICATIVE-FRICATIVE

Step 1  Set:  II1 = DUR1(J1)
              II2 = DUR2(J2)
              RMAX = 1.5
              XMUL = .625
              MDIV = 10
              MTEMP = 2
              Call GRAD
              SCORE = GSUM
              DIV1 = 20
              If SCORE ≥ 50 then DIV1 = 50

Step 2  Set:  II1 = Z31(J1)
              II2 = Z32(J2)
              RMAX = 4.0
              MTEMP = 4
              Call GRAD
              SCORE = SCORE + 2*GSUM

Step 3  Set:  II1 = A11(J1)
              II2 = A12(J2)
              MTEMP = 2
              Call GRAD
              SCORE = SCORE + GSUM

Step 4  Set:  II1 = A31(J1)
              II2 = A32(J2)
              Call GRAD
              SCORE = MAX ([illegible])
              RETURN

EV900: NASAL-NASAL

              If DUR1(J1) > 5 and DUR2(J2) > 5 go to EV600
              else:

Step 1  Set:  II1 = A11(J1)
              II2 = A12(J2)
              RMAX = 2.0
              XMUL = .375
              MDIV = 10
              MTEMP = 2
              Call GRAD
              SCORE = GSUM

              If SCORE < 0, return,
              else go to EV600

EV1000: CONSONANT-NASAL

              If DUR1(J1) < 5 or Z11(J1) > 5 or A21(J1)*12 > A11(J1)
              or A31(J1)*10 > A11(J1)
              then go to EV600
              else go to EV1110

EV1100: NASAL-CONSONANT

              If DUR2(J2) < 5 or Z12(J2) > 5 or A22(J2)*12 > A12(J2)
              or A32(J2)*10 > A12(J2)
              then go to EV600
              else go to EV1110

EV1110

Step 1  Set:  II1 = A11(J1)
              II2 = A12(J2)
              RMAX = 2.0
              XMUL = .35
              MDIV = 10
              MTEMP = 2
              Call GRAD
              SCORE = GSUM

              If SCORE < 0 then return.
              Else go to EV600